Multimodal Translation System Using Texture-Mapped Lip-Sync Images for Video Mail and Automatic Dubbing Applications
Similar resources
Multimodal Translation System Using Texture-Mapped Lip-Sync Images for Video Mail and Automatic Dubbing Applications
We introduce a multimodal English-to-Japanese and Japanese-to-English translation system that also translates the speaker’s speech motion by synchronizing it to the translated speech. This system also introduces both a face synthesis technique that can generate any viseme lip shape and a face tracking technique that can estimate the original position and rotation of a speaker’s face in an image...
A Real-Time Lip Sync System Using a Genetic Algorithm for Automatic Neural Network Configuration (ThuAmSS2)
In this paper we present a new method for mapping natural speech to lip shape animation in real time. The speech signal, represented by MFCC vectors, is classified into viseme classes using neural networks. The topology of neural networks is automatically configured using genetic algorithms. This eliminates the need for tedious manual neural network design by trial and error and considerably im...
Automatic generation of dubbing video slides for mobile wireless environment
Mobile wireless video delivery is still challenging due to its limited bandwidth and dynamic channel status. In this paper, a novel approach named Dubbing Video Slides (DVS) is proposed to cope with the bandwidth limitation problem. Based on a statistical video content importance analysis, the DVS method can dynamically select and transmit representative video frames which are relatively more impor...
Speaker independence in automated lip-sync for audio-video communication
By analyzing the absolute value of the Fourier transform of a speaker’s voice signal, we can predict the position of the mouth for English vowel sounds. This is done without the use of text, speech recognition, or mechanical or other sensing devices attached to the speaker’s mouth. This capability can reduce the time required for mouth animation considerably. We expect it to be competitive eventually ...
Multimodal speaker/speech recognition using lip motion, lip texture and audio
We present a new multimodal speaker/speech recognition system that integrates audio, lip texture and lip motion modalities. Fusion of audio and face texture modalities has been investigated in the literature before. The emphasis of this work is to investigate the benefits of inclusion of lip motion modality for two distinct cases: speaker and speech recognition. The audio modality is represente...
Journal
Journal title: EURASIP Journal on Advances in Signal Processing
Year: 2004
ISSN: 1687-6172,1687-6180
DOI: 10.1155/s1110865704404259